Investigating the shortcomings of HMM synthesis
Abstract
This paper presents the beginnings of a framework for formally testing the causes of the currently limited quality of HMM (Hidden Markov Model) speech synthesis. The framework separates the effects of modelling so that each one's independent effect on vocoded speech parameters can be observed, in order to address the issues that are restricting progress towards highly intelligible and natural-sounding speech synthesis. The simulated HMM synthesis conditions are applied to spectral speech parameters and tested via a pairwise listening test, in which listeners make a “same or different” judgement on the quality of the synthesised speech produced under each pair of conditions. These responses are then processed using multidimensional scaling to identify the qualities of modelled speech that listeners attend to and that therefore form the basis of why it is distinguishable from natural speech. Finally, future improvements to the framework are discussed, including its extension to more of the parameters modelled during speech synthesis.
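As a rough illustration of the analysis step described in the abstract, the sketch below turns "same or different" response counts into a dissimilarity matrix and embeds the conditions with classical (Torgerson) multidimensional scaling. The abstract does not say which MDS variant or dissimilarity measure the authors used, so the function names, the proportion-of-"different" measure, and the response counts are all assumptions made for illustration only.

```python
import numpy as np


def dissimilarity_from_responses(different_counts, total_counts):
    """Assumed measure: proportion of 'different' answers per pair of conditions."""
    p = different_counts / total_counts
    d = 0.5 * (p + p.T)          # symmetrise the two presentation orders
    np.fill_diagonal(d, 0.0)     # a condition is not dissimilar to itself
    return d


def classical_mds(d, n_dims=2):
    """Classical (Torgerson) MDS: embed items so distances approximate d."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    b = -0.5 * j @ (d ** 2) @ j                  # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(b)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]   # keep the largest n_dims
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against small negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical data: 4 synthesis conditions, 20 judgements per ordered pair.
different = np.array([
    [0, 3, 12, 18],
    [4, 0, 10, 17],
    [11, 9, 0, 8],
    [19, 16, 7, 0],
])
total = np.full((4, 4), 20)

coords = classical_mds(dissimilarity_from_responses(different, total), n_dims=2)
print(coords)  # one 2-D point per condition
```

In such a map, conditions that listeners rarely tell apart land close together, and the axes can then be inspected for the perceptual qualities that separate modelled from natural speech.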
Similar resources
Autoregressive clustering for HMM speech synthesis
The autoregressive HMM has been shown to provide efficient parameter estimation and high-quality synthesis, but in previous experiments decision trees derived from a non-autoregressive system were used. In this paper we investigate the use of autoregressive clustering for autoregressive HMM-based speech synthesis. We describe decision tree clustering for the autoregressive HMM and highlight dif...
Syllable based models for prosody modeling in HMM based speech synthesis
Simple4All is a speech synthesis research project that aims to ease the production of synthetic voices in new languages by means of unsupervised modeling techniques. In this work, we introduce syllable based models for prosody modeling in a Hidden Markov Model based Text-to-Speech system (HTS). As a part of investigating the potential for building speech synthesis systems in new languages with li...
On learning discontinuous human control strategies
Models of human control strategy (HCS), which accurately emulate dynamic human behavior, have far-reaching potential in areas ranging from robotics to virtual reality to the intelligent vehicle highway project. A number of learning algorithms, including fuzzy logic, neural networks, and locally weighted regression, exist for modeling continuous human control strategies. These algorithms, howe...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
Investigating HMMs as a parametric model for expressive speech synthesis in German
The paper investigates the potential of HMM based synthesis to support the parameterisation of expressive speech in German. First, we review the assets of HMMs in the perspective of previous works in speech modelling and speech transformation. It is shown that HMMs define a flexible parametric model of the speech acoustics, which readily integrates several levels of speech modelling, such as di...